Communication Requirements


DeMo: Decoupled Momentum Optimization

Peng, Bowen, Quesnelle, Jeffrey, Kingma, Diederik P.

arXiv.org Artificial Intelligence

Training large neural networks typically requires sharing gradients between accelerators through specialized high-speed interconnects. Drawing from the signal processing principles of frequency decomposition and energy compaction, we demonstrate that synchronizing full optimizer states and model parameters during training is unnecessary. By decoupling momentum updates and allowing controlled divergence in optimizer states across accelerators, we achieve improved convergence compared to state-of-the-art optimizers. We introduce Decoupled Momentum (DeMo), a fused optimizer and data parallel algorithm that reduces inter-accelerator communication requirements by several orders of magnitude. This enables training of large neural networks even with limited network bandwidth and heterogeneous hardware. Our method is topology-agnostic and architecture-independent and supports scalable clock-synchronous distributed training with negligible compute and memory overhead. Empirical results show that models trained with DeMo match or exceed the performance of equivalent models trained with AdamW, while eliminating the need for high-speed interconnects when pre-training large-scale foundation models. An open source reference PyTorch implementation is published on GitHub at https://github.com/bloc97/DeMo
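The abstract appeals to frequency decomposition and energy compaction to shrink what must be communicated. A toy sketch of that compression principle is below; it is not the DeMo algorithm itself (the reference implementation is at the linked repository), and the FFT choice, function names, and the top-k rule are illustrative assumptions. The uncommunicated residual would stay in each accelerator's local, decoupled momentum.

```python
import numpy as np

def compress_momentum(momentum, k):
    """Keep only the k largest-magnitude frequency components of a momentum buffer."""
    coeffs = np.fft.rfft(momentum)            # frequency decomposition
    idx = np.argsort(np.abs(coeffs))[-k:]     # energy compaction: top-k by magnitude
    sparse = np.zeros_like(coeffs)
    sparse[idx] = coeffs[idx]
    return sparse                             # only (idx, values) would be transmitted

def decompress(sparse_coeffs, n):
    """Reconstruct the dense buffer from the sparse frequency coefficients."""
    return np.fft.irfft(sparse_coeffs, n=n)

rng = np.random.default_rng(0)
m = np.cumsum(rng.normal(size=256))           # smooth, momentum-like signal
m_hat = decompress(compress_momentum(m, k=16), n=m.size)
# The residual m - m_hat is what "controlled divergence" leaves on each worker.
```

Because smooth signals concentrate their energy in few frequency components, transmitting 16 of 129 coefficients already reconstructs most of the buffer.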


Improving accuracy and convergence of federated learning edge computing methods for generalized DER forecasting applications in power grid

Nair, Vineet Jagadeesan, Pereira, Lucas

arXiv.org Artificial Intelligence

This proposal aims to develop more accurate federated learning (FL) methods with faster convergence properties and lower communication requirements, specifically for forecasting distributed energy resources (DER) such as renewables, energy storage, and loads in modern, low-carbon power grids. This will be achieved by (i) leveraging recently developed extensions of FL such as hierarchical and iterative clustering to improve performance with non-IID data, (ii) experimenting with different types of FL global models well-suited to time-series data, and (iii) incorporating domain-specific knowledge from power systems to build more general FL frameworks and architectures that can be applied to diverse types of DERs beyond just load forecasting, and with heterogeneous clients.


Educating for AI Cybersecurity Work and Research: Ethics, Systems Thinking, and Communication Requirements

Matei, Sorin Adam, Bertino, Elisa

arXiv.org Artificial Intelligence

The present study explored managerial and instructor perceptions of their freshly employed cybersecurity workers' or students' preparedness to work effectively in a changing cybersecurity environment that includes AI tools. Specifically, we related perceptions of technical preparedness to ethical, systems thinking, and communication skills. We found that managers and professors perceive preparedness to use AI tools in cybersecurity to be significantly associated with all three non-technical skill sets. Most importantly, ethics is a clear leader in the network of relationships. Contrary to expectations that ethical concerns are left behind in the rush to adopt the most advanced AI tools in security, both higher education instructors and managers appreciate their role and see them as closely associated with technical prowess. Another significant finding is that professors overestimate students' preparedness in ethical, systems thinking, and communication abilities compared to IT managers' perceptions of their newly employed IT workers.


FedNet2Net: Saving Communication and Computations in Federated Learning with Model Growing

Kundu, Amit Kumar, Jaja, Joseph

arXiv.org Artificial Intelligence

Federated learning (FL) is a recently developed area of machine learning, in which the private data of a large number of distributed clients is used to develop a global model under the coordination of a central server without explicitly exposing the data. The standard FL strategy has a number of significant bottlenecks, including large communication requirements and a high impact on clients' resources. Several strategies described in the literature attempt to address these issues. In this paper, a novel scheme based on the notion of "model growing" is proposed. Initially, the server deploys a small model of low complexity, which is trained to capture the data complexity during the initial set of rounds. When the performance of such a model saturates, the server switches to a larger model with the help of function-preserving transformations. The model complexity increases as more data is processed by the clients, and the overall process continues until the desired performance is achieved. Therefore, the most complex model is broadcast only at the final stage in our approach, resulting in a substantial reduction in communication cost and client computational requirements. The proposed approach is tested extensively on three standard benchmarks and is shown to achieve substantial reductions in communication and client computation while achieving accuracy comparable to the current most effective strategies.
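The "function-preserving transformations" the abstract relies on can be sketched with a Net2Net-style layer widening: replicate hidden units and divide their outgoing weights by the replication count so the network computes exactly the same function before further training. This is a generic textbook construction, not FedNet2Net's specific procedure; the cyclic replication rule and names are illustrative.

```python
import numpy as np

def widen_layer(W1, W2, new_width):
    """Widen the hidden layer of a 2-layer net (W1: in x h, W2: h x out)
    from h to new_width units without changing the computed function."""
    in_dim, h = W1.shape
    mapping = np.arange(new_width) % h            # replicate units cyclically
    counts = np.bincount(mapping, minlength=h)    # how many copies each unit got
    W1_new = W1[:, mapping]                       # copy incoming weights
    W2_new = W2[mapping, :] / counts[mapping][:, None]  # split outgoing weights
    return W1_new, W2_new

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(3, 2))
x = rng.normal(size=(5, 4))
W1n, W2n = widen_layer(W1, W2, 6)
# Identical copies also pass unchanged through an elementwise ReLU,
# so the preservation holds for the nonlinear network as well.
out_small = np.maximum(x @ W1, 0) @ W2
out_wide = np.maximum(x @ W1n, 0) @ W2n
```

Because copies of a unit produce identical activations, dividing their outgoing weights by the copy count makes the widened layer's contribution sum back to the original, which is why the server can grow the model mid-training without a performance drop.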


Report 84-14 A Variable Supply Model for Distributing

AI Classics

Multiple processors can be used to achieve a speedup of a backward-chaining deduction by distributing or-parallel deductions. However, the actual speedup obtained is strongly dependent on the task allocation strategy. Also, communication cost can be a significant part of the overall cost of a deduction. For the multiple-processor scenario used in this paper, processors with replicated databases on a broadcast network, a variable supply model (VSM) is presented. VSM represents an infinite class of strategies with varying communication requirements.


Sigma Point Belief Propagation

Meyer, Florian, Hlinka, Ondrej, Hlawatsch, Franz

arXiv.org Artificial Intelligence

The sigma point (SP) filter, also known as the unscented Kalman filter, is an attractive alternative to the extended Kalman filter and the particle filter. Here, we extend the SP filter to nonsequential Bayesian inference corresponding to loopy factor graphs. We propose sigma point belief propagation (SPBP) as a low-complexity approximation of the belief propagation (BP) message passing scheme. SPBP achieves approximate marginalizations of posterior distributions corresponding to (generally) loopy factor graphs. It is well suited for decentralized inference because of its low communication requirements. For a decentralized, dynamic sensor localization problem, we demonstrate that SPBP can outperform nonparametric (particle-based) BP while requiring significantly fewer computations and less communication.
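The unscented transform underlying the SP filter can be sketched as follows. This is the generic textbook construction (propagate 2n+1 deterministically chosen sigma points through a nonlinearity and re-estimate mean and covariance), not the paper's SPBP messages; the parameterization with kappa and all names are illustrative.

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=0.0):
    """Propagate a Gaussian (mean, cov) through f using 2n+1 sigma points."""
    n = mean.size
    S = np.linalg.cholesky((n + kappa) * cov)   # matrix square root of scaled cov
    points = [mean] \
        + [mean + S[:, i] for i in range(n)] \
        + [mean - S[:, i] for i in range(n)]
    weights = np.array([kappa / (n + kappa)] + [1.0 / (2 * (n + kappa))] * (2 * n))
    Y = np.array([f(p) for p in points])        # push each sigma point through f
    mean_y = weights @ Y
    diff = Y - mean_y
    cov_y = (weights[:, None] * diff).T @ diff  # weighted outer products
    return mean_y, cov_y

# For a linear map the transform is exact: mean -> A @ mean, cov -> A @ cov @ A.T
A = np.array([[1.0, 2.0], [0.0, 1.0]])
mean = np.array([1.0, -1.0])
cov = np.array([[2.0, 0.3], [0.3, 1.0]])
my, cy = unscented_transform(mean, cov, lambda x: A @ x)
```

In a decentralized setting only the low-dimensional (mean, covariance) pairs need to be exchanged between sensors, which is the source of the low communication requirements the abstract claims over particle-based messages.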


A Scalable Message-Passing Algorithm for Supply Chain Formation

Penya-Alba, Toni (Instituto de Investigación en Inteligencia Artificial (IIIA) Consejo Superior de Investigaciones Cientificas (CSIC)) | Vinyals, Meritxell (University of Verona) | Cerquides, Jesus (Instituto de Investigación en Inteligencia Artificial (IIIA) Consejo Superior de Investigaciones Cientificas (CSIC)) | Rodriguez-Aguilar, Juan A. (Instituto de Investigación en Inteligencia Artificial (IIIA) Consejo Superior de Investigaciones Cientificas (CSIC))

AAAI Conferences

Supply Chain Formation (SCF) is the process of determining the participants in a supply chain, who will exchange what with whom, and the terms of the exchanges. Decentralized SCF appears as a highly intricate task because agents only possess local information and have limited knowledge about the capabilities of other agents. The decentralized SCF problem has recently been cast as an optimization problem that can be efficiently approximated using max-sum loopy belief propagation. Along this direction, in this paper we propose a novel encoding of the problem into a binary factor graph (containing only binary variables) as well as an alternative algorithm. We empirically show that our approach significantly increases scalability, making it possible to form supply chains in market scenarios with a large number of participants and high competition.
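The max-sum machinery the abstract builds on can be illustrated on the smallest possible binary factor graph: two binary variables, their unary potentials, and one pairwise factor. This toy example shows the message computation and decoding only; it is not the paper's SCF encoding, and all potentials are made up for illustration.

```python
import numpy as np

# Two binary variables x1, x2 with unary potentials u1, u2
# and one pairwise factor F scoring joint assignments (x1, x2).
u1 = np.array([0.0, 1.5])
u2 = np.array([0.5, 0.0])
F = np.array([[0.0, 2.0],
              [1.0, 0.0]])

# Max-sum message from factor F to x1: maximize over x2 of (F + message from x2).
m_F_to_x1 = (F + u2[None, :]).max(axis=1)
belief_x1 = u1 + m_F_to_x1          # combine with x1's own potential
x1 = int(belief_x1.argmax())
x2 = int((F[x1] + u2).argmax())     # decode x2 consistently with x1

# Sanity check against brute force over all four assignments.
best = max(((a, b) for a in (0, 1) for b in (0, 1)),
           key=lambda ab: u1[ab[0]] + u2[ab[1]] + F[ab[0], ab[1]])
```

On this tree-structured example max-sum is exact; on the loopy graphs that realistic SCF instances induce, the same local message updates are run iteratively and yield an approximation.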